Abstract: It is now well understood that the $\ell_1$ minimization algorithm is able to recover sparse signals from incomplete measurements [2], [1], [3], and sharp recoverable sparsity thresholds have also been obtained for the $\ell_1$ minimization algorithm. However, even though iterative reweighted $\ell_1$ minimization algorithms and related algorithms have been empirically observed to boost the recoverable sparsity thresholds for certain types of signals, no rigorous theoretical results have been established to prove this fact. In this paper, we try to provide a theoretical foundation for analyzing the iterative reweighted $\ell_1$ algorithms. In particular, we show that for a nontrivial class of signals, iterative reweighted $\ell_1$ minimization can indeed deliver recoverable sparsity thresholds larger than those given in [1], [3]. Our results are based on a high-dimensional geometrical analysis (Grassmann angle analysis) of the null-space characterization for $\ell_1$ minimization and weighted $\ell_1$ minimization algorithms.

Index Terms: compressed sensing, basis pursuit, Grassmann angle, reweighted $\ell_1$ minimization, random linear subspaces

I. Introduction 

In this paper we are interested in compressed sensing problems. Namely, we would like to find $x$ such that

$$Ax = y, \tag{1}$$

where $A$ is an $m \times n$ ($m < n$) measurement matrix, $y$ is an $m \times 1$ measurement vector and $x$ is an $n \times 1$ unknown vector with only $k$ ($k < m$) nonzero components. We will further assume that the number of measurements is $m = \delta n$ and the number of nonzero components of $x$ is $k = \zeta n$, where $0 < \zeta < 1$ and $0 < \delta < 1$ are constants independent of $n$ (clearly, $\delta > \zeta$).

A particular way of solving (1) which has recently generated a large amount of research is called $\ell_1$-optimization (basis pursuit) [2]. It proposes solving the following problem:

$$\min \|x\|_1 \quad \text{subject to} \quad Ax = y. \tag{2}$$
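As a brief aside on how the $\ell_1$ problem is usually solved in practice, it can be rewritten as a linear program. The auxiliary vector $t$ below is introduced purely for illustration and does not appear in the original text; the reformulation is the standard one and is offered as a sketch, not as the specific solver used by the authors.

```latex
\begin{aligned}
\min_{x,\, t \in \mathbb{R}^n} \quad & \sum_{i=1}^{n} t_i \\
\text{subject to} \quad & Ax = y, \\
& -t_i \le x_i \le t_i, \qquad i = 1, \dots, n.
\end{aligned}
```

At an optimum each $t_i$ equals $|x_i|$, so the optimal value coincides with the minimum $\ell_1$ norm. Replacing the objective by $\sum_i w_i t_i$ with positive weights $w_i$ gives the weighted variant in the same way.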



Quite remarkably, in [2] the authors were able to show that if the number of measurements is $m = \delta n$ and if the matrix $A$ satisfies a special property called the restricted isometry property (RIP), then any unknown vector $x$ with no more than $k = \zeta n$ nonzero elements (where $\zeta$ is an absolute constant which is a function of $\delta$, but independent of $n$, and explicitly bounded in [2]) can be recovered by solving (2).



Instead of characterizing the $m \times n$ matrix $A$ through the RIP condition, in [1], [3] the authors assume that $A$ constitutes a $k$-neighborly polytope. It turns out (as shown in [1]) that this characterization of the matrix $A$ is in fact a necessary and sufficient condition for (2) to produce the solution of (1). Furthermore, using the results of [4], [7], [8], it can be shown that if the matrix $A$ has i.i.d. zero-mean Gaussian entries, then with overwhelming probability it also constitutes a $k$-neighborly polytope. The precise relation between $m$ and $k$ in order for this to happen is characterized in [1] as well.

In this paper we will be interested in providing theoretical guarantees for the emerging iterative reweighted $\ell_1$ algorithms [16]. These algorithms iteratively update the weights for each element of $x$ in the objective function of $\ell_1$ minimization, based on the decoding results from previous iterations. Experiments showed that the iterative reweighted $\ell_1$ algorithms can greatly enhance the recoverable sparsity threshold for certain types of signals, for example, sparse signals with Gaussian entries. However, no rigorous theoretical results have been provided for establishing this phenomenon. To quote from [16], "any result quantifying the improvement of the reweighted algorithm for special classes of sparse or nearly sparse signals would be significant". In this paper, we try to provide a theoretical foundation for analyzing the iterative reweighted $\ell_1$ algorithms. In particular, we show that for a nontrivial class of signals, a modified iterative reweighted $\ell_1$ minimization algorithm can indeed deliver recoverable sparsity thresholds larger than those given in [1], [3] for unweighted $\ell_1$ minimization algorithms. (It is worth noting that, empirically, the iterative reweighted $\ell_1$ algorithms do not always improve the recoverable sparsity thresholds; for example, they often fail to do so when the nonzero elements of the signals are "flat" [16].) Our results are based on a high-dimensional geometrical analysis (Grassmann angle analysis) of the null-space characterization for $\ell_1$ minimization and weighted $\ell_1$ minimization algorithms. The main idea is to show that the preceding $\ell_1$ minimization iterations can provide certain information about the support set of the signals, and that this support set information can be properly taken advantage of to perfectly recover the signals even though the sparsity of the signal $x$ itself is large.

This paper is structured as follows. In Section II we present the iterative reweighted $\ell_1$ algorithm for analysis. The signal model for $x$ will be given in Section III. In Section IV and Section V we will show how the iterative reweighted $\ell_1$ minimization algorithm can indeed improve recoverable sparsity thresholds. Numerical results will be given in Section VI.

II. The Modified Iterative Reweighted $\ell_1$ Minimization Algorithm

Let $w_i^t$, $i = 1, \dots, n$, denote the weight for the $i$-th element $x_i$ of $x$ in the $t$-th iteration of the iterative reweighted $\ell_1$ minimization algorithm, and let $W^t$ be the diagonal matrix with $w_1^t, w_2^t, \dots, w_n^t$ on the diagonal. In [16], the following iterative reweighted $\ell_1$ minimization algorithm is presented:

Algorithm 1: [16] 

1) Set the iteration count $t$ to zero and $w_i^t = 1$, $i = 1, \dots, n$.

2) Solve the weighted $\ell_1$ minimization problem

$$x^t = \arg\min \|W^t x\|_1 \quad \text{subject to} \quad y = Ax. \tag{3}$$

3) Update the weights: for each $i = 1, \dots, n$,

$$w_i^{t+1} = \frac{1}{|x_i^t| + \epsilon'}, \tag{4}$$

where $\epsilon'$ is a tunable positive number.

4) Terminate on convergence or when $t$ attains a specified maximum number of iterations $t_{\max}$. Otherwise, increment $t$ and go to step 2.
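The weight update in step 3 of Algorithm 1 can be sketched in a few lines. This assumes the update is the inverse-magnitude rule $w_i^{t+1} = 1/(|x_i^t| + \epsilon')$ from [16]; the function name `update_weights` and its argument names are hypothetical and chosen only for illustration.

```python
def update_weights(x_t, eps_prime):
    """Inverse-magnitude reweighting: w_i = 1 / (|x_i| + eps_prime).

    x_t       -- current estimate x^t as a list of floats
    eps_prime -- tunable positive number (keeps weights finite where x_i = 0)
    """
    if eps_prime <= 0:
        raise ValueError("eps_prime must be positive")
    # Small entries get large weights, so the next weighted l1 problem
    # penalizes them more heavily.
    return [1.0 / (abs(v) + eps_prime) for v in x_t]
```

Entries that are exactly zero receive the largest weight, $1/\epsilon'$, which is why $\epsilon'$ must be strictly positive.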

For the sake of tractable analysis, we will analyze a different iterative reweighted $\ell_1$ minimization algorithm, which still captures the essence of the reweighted $\ell_1$ algorithm presented in [16]. In our modified algorithm, we solve only two $\ell_1$ minimization programs; namely, we stop at the time index $t = 1$.

Algorithm 2:

1) Set the iteration count $t$ to zero and $w_i^t = 1$, $i = 1, \dots, n$.

2) Solve the weighted $\ell_1$ minimization problem

$$x^t = \arg\min \|W^t x\|_1 \quad \text{subject to} \quad y = Ax. \tag{5}$$

3) Update the weights: find the index set $K' \subset \{1, 2, \dots, n\}$ which corresponds to the largest $(1-\epsilon)\rho_F(\delta)\delta n$ elements of $x^0$ in amplitude, where $0 < \epsilon < 1$ is a specified parameter and $\rho_F(\delta)$ is the weak threshold for perfect recovery defined in [1] using $\ell_1$ minimization (thus $\zeta = \rho_F(\delta)\delta$ is the weak sparsity threshold). Then assign the weight $W_1 = 1$ to those $w_i^{t+1}$ corresponding to the set $K'$ and assign the weight $W_2 = W$, $W > 1$, to those $w_i^{t+1}$ corresponding to the complementary set $\bar{K'} = \{1, 2, \dots, n\} \setminus K'$.

4) Terminate on convergence or when $t = 1$. Otherwise, increment $t$ and go to step 2.
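Step 3 of Algorithm 2 amounts to a top-$k$ selection followed by a two-level weight assignment. The sketch below illustrates this under stated assumptions: the function name `two_level_weights` is hypothetical, the value of $\rho_F(\delta)$ is passed in by the caller rather than computed, and rounding the set size down to an integer is an assumption since the text does not specify a rounding rule.

```python
import math

def two_level_weights(x0, rho_f, delta, eps, big_w):
    """Return (K_prime, weights) for step 3 of the modified algorithm.

    x0     -- output of the first (equal-weighted) l1 minimization
    rho_f  -- weak threshold rho_F(delta), supplied by the caller
    delta  -- measurement ratio m / n
    eps    -- parameter in (0, 1)
    big_w  -- weight W > 1 placed outside K_prime
    """
    n = len(x0)
    # Size of K'; floor is an assumed rounding rule.
    k = math.floor((1 - eps) * rho_f * delta * n)
    # Indices by decreasing magnitude, ties broken by lower index.
    order = sorted(range(n), key=lambda i: (-abs(x0[i]), i))
    k_prime = set(order[:k])
    weights = [1.0 if i in k_prime else float(big_w) for i in range(n)]
    return k_prime, weights
```

The returned weights form the diagonal of $W^1$ for the second and final weighted $\ell_1$ problem.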

This modified algorithm is certainly different from the 
algorithm from [16], but the important thing is that both 
algorithms assign bigger weights to those elements of x which 
are more likely to be 0. 



III. Signal Model for x 

In this paper, we consider the following model for the $n$-dimensional sparse signal $x$. First of all, we assume that there exists a set $K \subset \{1, 2, \dots, n\}$ with cardinality $|K| = (1-\epsilon)\rho_F(\delta)\delta n$ such that each of the elements of $x$ over the set $K$ is large in amplitude. W.L.O.G., those elements are assumed to be all larger than $a_1 > 0$ in amplitude. For a given signal $x$, one might take such a set $K$ to be the set corresponding to the $(1-\epsilon)\rho_F(\delta)\delta n$ largest elements of $x$ in amplitude.

Secondly, letting $\bar{K} = \{1, 2, \dots, n\} \setminus K$, we assume that the $\ell_1$ norm of $x$ over the set $\bar{K}$, denoted by $\|x_{\bar{K}}\|_1$, is upper-bounded by $\Delta$, though $\Delta$ is allowed to take a non-diminishing portion of the total $\ell_1$ norm $\|x\|_1$ as $n \to \infty$. We further denote the support set of $x$ as $K_{\text{total}}$ and its complement as $\bar{K}_{\text{total}}$. The sparsity of the signal $x$, namely the total number of nonzero elements in the signal $x$, is then $|K_{\text{total}}| = k_{\text{total}} = \xi n$, where $\xi$ can be above the weak sparsity threshold $\zeta = \rho_F(\delta)\delta$ achievable using the $\ell_1$ algorithm.
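To make the quantities of the signal model concrete, the following sketch extracts them from a given vector. The function name `signal_model_parameters` is hypothetical; it takes $K$ to be the $k$ largest-magnitude entries, as the text suggests one might, and reports the smallest magnitude on $K$, so any $a_1$ strictly below that value satisfies the model.

```python
def signal_model_parameters(x, k):
    """Return (K, min_mag_on_K, tail_l1, k_total) for a signal x and 1 <= k <= len(x).

    K             -- indices of the k largest-magnitude entries
    min_mag_on_K  -- smallest magnitude on K (a_1 must lie below this)
    tail_l1       -- l1 norm of x outside K (the smallest valid Delta)
    k_total       -- number of nonzero entries of x
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: (-abs(x[i]), i))
    big = set(order[:k])
    min_mag_on_k = min(abs(x[i]) for i in big)
    tail_l1 = sum(abs(x[i]) for i in range(n) if i not in big)
    k_total = sum(1 for v in x if v != 0)
    return big, min_mag_on_k, tail_l1, k_total
```

For example, a signal with two large entries and a few small ones has `k_total` larger than `k`, which is exactly the regime of interest: the support exceeds $|K|$, but the mass outside $K$ is small.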

In the following sections, we will show that if certain conditions on $a_1$, $\Delta$ and the measurement matrix $A$ are satisfied, we will be able to perfectly recover the signal $x$ using Algorithm 2 even though its sparsity level is above the sparsity threshold for $\ell_1$ minimization. Intuitively, this is because the weighted $\ell_1$ minimization puts larger weights on the signal elements which are more likely to be zero, and puts smaller weights on the signal support set, thus promoting sparsity at the right positions. In order to achieve this, we need some prior information about the support set of $x$, which can be obtained from the decoding results in previous iterations. We will first argue that the equal-weighted $\ell_1$ minimization of Algorithm 2 can sometimes provide very good information about the support set of the signal $x$.

IV. Estimating the Support Set from the $\ell_1$ Minimization

Since the set $K'$ corresponds to the largest elements in the decoding results of $\ell_1$ minimization, one might guess that most of the elements in $K'$ are also in the support set $K_{\text{total}}$. The goal of this section is to get an upper bound on the cardinality of the set $\bar{K}_{\text{total}} \cap K'$, namely the number of zero elements of $x$ over the set $K'$. To this end, we will first give the notion of "weak" robustness for the $\ell_1$ minimization.

Let $K$ be fixed and $x_K$, the value of $x$ on this set, be also fixed. Then the solution produced by (2), $\hat{x}$, will be called weakly robust if, for some $C > 1$ and all possible $x_{\bar{K}}$, it holds that

$$\|(x - \hat{x})_{\bar{K}}\|_1 \le \frac{2C}{C-1}\|x_{\bar{K}}\|_1,$$

and

$$\|x_K\|_1 - \|\hat{x}_K\|_1 \le \frac{2}{C-1}\|x_{\bar{K}}\|_1.$$

The above "weak" notion of robustness allows us to bound the error $\|x - \hat{x}\|_1$ in the following way. If the matrix $A_K$, obtained by retaining only those columns of $A$ that are indexed by $K$, has full column rank, then the quantity

$$\kappa = \max_{Aw = 0,\, w \neq 0} \frac{\|w_K\|_1}{\|w_{\bar{K}}\|_1}$$

must be finite ($\kappa < \infty$). In particular, since $x - \hat{x}$ is in the null space of $A$ ($y = Ax = A\hat{x}$), we have

$$\|x - \hat{x}\|_1 = \|(x - \hat{x})_K\|_1 + \|(x - \hat{x})_{\bar{K}}\|_1 \le (1 + \kappa)\|(x - \hat{x})_{\bar{K}}\|_1 \le \frac{2C(1 + \kappa)}{C-1}\|x_{\bar{K}}\|_1,$$

thus bounding the recovery error. We can now give necessary and sufficient conditions on the measurement matrix $A$ to satisfy the notion of weak robustness for $\ell_1$ minimization.

Theorem 1: For a given $C > 1$, support set $K$, and $x_K$, the solution $\hat{x}$ produced by (2) will be weakly robust if, and only if, $\forall w \in \mathbb{R}^n$ such that $Aw = 0$, we have

$$\|x_K + w_K\|_1 + \left\|\frac{w_{\bar{K}}}{C}\right\|_1 \ge \|x_K\|_1. \tag{6}$$



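Proof: Sufficiency: Let $w = \hat{x} - x$, for which $Aw = A(\hat{x} - x) = 0$. Since $\hat{x}$ is the minimum $\ell_1$ norm solution, we have $\|x\|_1 \ge \|\hat{x}\|_1 = \|x + w\|_1$, and therefore $\|x_K\|_1 + \|x_{\bar{K}}\|_1 \ge \|x_K + w_K\|_1 + \|x_{\bar{K}} + w_{\bar{K}}\|_1$. Thus,

$$\|x_K\|_1 - \|x_K + w_K\|_1 \ge \|w_{\bar{K}} + x_{\bar{K}}\|_1 - \|x_{\bar{K}}\|_1 \ge \|w_{\bar{K}}\|_1 - 2\|x_{\bar{K}}\|_1.$$

But the condition (6) guarantees that

$$\|w_{\bar{K}}\|_1 \ge C\left(\|x_K\|_1 - \|x_K + w_K\|_1\right),$$

so we have

$$\|w_{\bar{K}}\|_1 \le \frac{2C}{C-1}\|x_{\bar{K}}\|_1,$$

and

$$\|x_K\|_1 - \|\hat{x}_K\|_1 \le \frac{2}{C-1}\|x_{\bar{K}}\|_1,$$

as desired.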

Necessity: Since equalities can be achieved in the triangle inequalities used in the above proof of sufficiency, the condition (6) is also a necessary condition for the weak robustness to hold for every $x$. (Otherwise, for certain $x$, there will be an $x' = x + w$ with $\|x'\|_1 < \|x\|_1$ which violates the respective robustness definitions. Also, such an $x'$ can be the solution to (2).) ■

We should remark (without proof, in the interest of space) the following. For any $\delta > 0$ and $0 < \epsilon < 1$, let $|K| = (1-\epsilon)\rho_F(\delta)\delta n$, and suppose the elements of the measurement matrix $A$ are sampled i.i.d. from a Gaussian distribution. Then there exists a constant $C > 1$ (as a function of $\delta$ and $\epsilon$) such that the condition (6) is satisfied with overwhelming probability as the problem dimension $n \to \infty$. At the same time, the parameter $\kappa$ defined above is upper-bounded by a finite constant (independent of the problem dimension $n$) with overwhelming probability as $n \to \infty$. These claims can be shown by using the Grassmann angle approach for the balancedness property of random linear subspaces in [12]. In the current version of our paper, we make no attempt to explicitly express the parameters $C$ and $\kappa$.

In Algorithm 2, after the equal-weighted $\ell_1$ minimization, we pick the set $K'$ corresponding to the $(1-\epsilon)\rho_F(\delta)\delta n$ largest elements in amplitude from the decoding result $\hat{x}$ (namely $x^0$ in the algorithm description) and assign the weight $W_1 = 1$ to the corresponding elements in the next iteration of reweighted $\ell_1$ minimization. Now we can show that an overwhelming portion of the set $K'$ is also in the support set $K_{\text{total}}$ of $x$ if the measurement matrix $A$ satisfies the specified weak robustness property.

Theorem 2: Suppose that we are given a signal vector $x \in \mathbb{R}^n$ satisfying the signal model defined in Section III. Given $\delta > 0$ and a measurement matrix $A$ which satisfies the weak robustness condition in (6) with its corresponding $C > 1$ and $\kappa < \infty$, the set $K'$ generated by the equal-weighted $\ell_1$ minimization in Algorithm 2 contains at most

$$\frac{2C}{(C-1)\frac{a_1}{2}}\|x_{\bar{K}}\|_1 + \frac{2C\kappa}{(C-1)\frac{a_1}{2}}\|x_{\bar{K}}\|_1$$

indices which are outside the support set of the signal $x$.

Proof: Since the measurement matrix $A$ satisfies the weak robustness condition for the set $K$ and the signal $x$,

$$\|(x - \hat{x})_{\bar{K}}\|_1 \le \frac{2C}{C-1}\|x_{\bar{K}}\|_1.$$

By the definition of $\kappa < \infty$, namely,

$$\kappa = \max_{Aw = 0,\, w \neq 0} \frac{\|w_K\|_1}{\|w_{\bar{K}}\|_1},$$

we have

$$\|(x - \hat{x})_K\|_1 \le \kappa\|(x - \hat{x})_{\bar{K}}\|_1.$$

Then there are at most $\frac{2C}{(C-1)\frac{a_1}{2}}\|x_{\bar{K}}\|_1$ indices that are outside the support set of $x$ but have amplitudes larger than $\frac{a_1}{2}$ in the corresponding positions of the decoding result $\hat{x}$ from the equal-weighted $\ell_1$ minimization algorithm. This bound follows easily from the facts that all such indices are in the set $\bar{K}$ and that

$$\|(x - \hat{x})_{\bar{K}}\|_1 \le \frac{2C}{C-1}\|x_{\bar{K}}\|_1.$$

Similarly, there are at most $\frac{2C\kappa}{(C-1)\frac{a_1}{2}}\|x_{\bar{K}}\|_1$ indices which are originally in the set $K$ but now have corresponding amplitudes smaller than $\frac{a_1}{2}$ in the decoded result $\hat{x}$ of the equal-weighted $\ell_1$ algorithm.

Since the set $K'$ corresponds to the largest $(1-\epsilon)\rho_F(\delta)\delta n$ elements of the decoded signal $\hat{x}$, by combining the previous two results, it is not hard to see that the number of indices which are outside the support set of $x$ but are in the set $K'$ is no bigger than

$$\frac{2C}{(C-1)\frac{a_1}{2}}\|x_{\bar{K}}\|_1 + \frac{2C\kappa}{(C-1)\frac{a_1}{2}}\|x_{\bar{K}}\|_1.$$



As we can see, Theorem 2 provides useful information about the support set of the signal $x$, which can be used in the analysis of weighted $\ell_1$ minimization via the null-space Grassmann angle approach for the weighted $\ell_1$ minimization algorithm [13].
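The quantity that Theorem 2 bounds can be measured directly when both the true signal and the decoded signal are available, for instance in a simulation. The sketch below counts the indices of $K'$ that fall outside the true support; the function name `misplaced_indices` and its arguments are hypothetical, and the decoded vector is assumed to be supplied by some $\ell_1$ solver that is not shown here.

```python
def misplaced_indices(x, x_hat, k):
    """Indices among the k largest-magnitude entries of x_hat where x is zero.

    x      -- true signal
    x_hat  -- decoded signal from the equal-weighted l1 minimization
    k      -- size of K' (the number of entries kept)
    """
    order = sorted(range(len(x_hat)), key=lambda i: (-abs(x_hat[i]), i))
    k_prime = order[:k]
    # An index is misplaced if it was selected but the true entry is zero.
    return [i for i in k_prime if x[i] == 0]
```

Comparing the length of the returned list against the bound of Theorem 2 is one way to check how conservative the bound is for a particular instance.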



V. The Grassmann Angle Approach for the Reweighted $\ell_1$ Minimization

In the previous work [13], the authors have shown that by exploiting certain prior information about the original signal, it is possible to extend the threshold of the sparsity factor for successful recovery beyond the original bounds of [1], [3]. The authors proposed a nonuniform sparsity model in which the entries of the vector $x$ can be divided into $T$ different classes, where in the $i$-th class, each entry is (independently of the others) nonzero with probability $P_i$, and zero with probability $1 - P_i$. The signals generated based on this model will have around $n_1 P_1 + \cdots + n_T P_T$ nonzero entries with high probability, where $n_i$ is the size of the $i$-th class. Examples of such signals arise in many applications, such as medical or natural imaging, satellite imaging, DNA micro-arrays, network monitoring and so on. They prove that, provided such structural prior information is available about the signal, a proper weighted $\ell_1$-minimization strictly outperforms the regular $\ell_1$-minimization in recovering signals with some fixed average sparsity from under-determined linear i.i.d. Gaussian measurements.

The detailed analysis in [13] is only done for $T = 2$, and is based on the high-dimensional geometrical interpretation of the constrained weighted $\ell_1$-minimization problem:

$$\min_{Ax = y} \sum_{i=1}^{n} w_i |x_i|.$$

Let the two classes of entries be denoted by $K_1$ and $K_2$. Also, due to the partial symmetry, for any suboptimal set of weights $\{w_1, \dots, w_n\}$ we have the following:

$$\forall i \in \{1, 2, \dots, n\}: \quad w_i = \begin{cases} W_1 & \text{if } i \in K_1, \\ W_2 & \text{if } i \in K_2. \end{cases}$$


The following theorem is implicitly proven in [13] and more explicitly stated and proven in [14].

Theorem 3: Let $\gamma_1 = \frac{|K_1|}{n}$ and $\gamma_2 = \frac{|K_2|}{n}$. If $\gamma_1$, $\gamma_2$, $P_1$, $P_2$, $W_1$ and $W_2$ are fixed, there exists a critical threshold $\delta_c = \delta_c(\gamma_1, \gamma_2, P_1, P_2, \frac{W_2}{W_1})$, totally computable, such that if $\delta = \frac{m}{n} \ge \delta_c$, then a vector $x$ generated randomly based on the described nonuniformly sparse model can be recovered from the weighted $\ell_1$-minimization problem stated above with probability $1 - o(e^{-cn})$ for some positive constant $c$.

In [13] and [14], a way of computing $\delta_c$ is presented which, in the uniform sparse case (e.g. $\gamma_2 = 0$) and with equal weights, is consistent with the weak threshold of Donoho and Tanner for almost sure recovery of sparse signals with $\ell_1$-minimization.

In summary, given a certain $\delta$, the two different weights $W_1$ and $W_2$ for weighted $\ell_1$ minimization, the sizes of the two weighted blocks, and also the number (or proportion) of nonzero elements inside each weighted block, the framework from [13] can determine whether a uniform random measurement matrix will be able to perfectly recover the original signals with overwhelming probability. Using this framework we can now begin to analyze the performance of the modified re-weighted algorithm of Section II. Although we are not directly given prior information about the signal structure, as in the nonuniform sparse model for instance, one might hope to infer such information after the first step of the modified re-weighted algorithm. To this end, note that the immediate step in the algorithm after the regular $\ell_1$-minimization is to choose the largest $(1-\epsilon)\rho_F(\delta)\delta n$ entries in absolute value. This is equivalent to splitting the index set of the vector $x$ into two classes $K'$ and $\bar{K'}$, where $K'$ corresponds to the larger entries. We now try to find a correspondence between this setup and the setup of [13], where the sparsity factors on the sets $K'$ and $\bar{K'}$ are known. We claim the following lower bound on the number of nonzero entries of $x$ with index in $K'$.

Theorem 4: There are at least $(1-\epsilon)\rho_F(\delta)\delta n - \frac{4C(\kappa+1)\Delta}{(C-1)a_1}$ nonzero entries in $x$ with index in the set $K'$.

Proof: Directly from Theorem 2 and the fact that $\|x_{\bar{K}}\|_1 \le \Delta$. ■

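The above result simply gives us a lower bound on the sparsity factor (ratio of nonzero elements) in the vector $x_{K'}$. Dividing the count in Theorem 4 by $|K'| = (1-\epsilon)\rho_F(\delta)\delta n$ gives

$$P_1 \ge 1 - \frac{4C(\kappa+1)\Delta}{(C-1)a_1(1-\epsilon)\rho_F(\delta)\delta n}.$$

Since we also know the original sparsity of the signal, $\|x\|_0 \le k_{\text{total}}$, we have the following upper bound on the sparsity factor of the second block of the signal, $x_{\bar{K'}}$:

$$P_2 \le \frac{k_{\text{total}} - (1-\epsilon)\rho_F(\delta)\delta n + \frac{4C(\kappa+1)\Delta}{(C-1)a_1}}{n - (1-\epsilon)\rho_F(\delta)\delta n}.$$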
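The two sparsity-factor bounds follow from Theorem 4 by simple arithmetic, which the sketch below spells out. It assumes the lower bound on $P_1$ is obtained by dividing the count in Theorem 4 by $|K'|$, and the upper bound on $P_2$ by assigning all remaining nonzeros to the complement of $K'$. The function name and argument names are hypothetical, and the clamping to $[0, 1]$ is added only so that vacuous bounds remain valid probabilities.

```python
def sparsity_factor_bounds(n, k_prime_size, k_total, c, kappa, a1, delta_norm):
    """Return (P1 lower bound, P2 upper bound) for the blocks K' and its complement.

    n            -- signal dimension
    k_prime_size -- |K'| = (1 - eps) * rho_F(delta) * delta * n
    k_total      -- total number of nonzeros of x
    c, kappa     -- robustness constants C > 1 and kappa
    a1           -- amplitude lower bound on K
    delta_norm   -- upper bound Delta on the l1 norm of x outside K
    """
    if c <= 1 or a1 <= 0 or not 0 < k_prime_size < n:
        raise ValueError("need c > 1, a1 > 0 and 0 < k_prime_size < n")
    # Maximum number of indices in K' outside the support (Theorem 4 slack).
    slack = 4.0 * c * (kappa + 1.0) * delta_norm / ((c - 1.0) * a1)
    p1_lower = max(0.0, 1.0 - slack / k_prime_size)
    p2_upper = min(1.0, (k_total - k_prime_size + slack) / (n - k_prime_size))
    return p1_lower, p2_upper
```

With illustrative values `n=1000`, `k_prime_size=100`, `k_total=150`, `c=2`, `kappa=1`, `a1=8` and `delta_norm=1`, the slack is 2 indices, so the first block is at least 98% nonzero while the second block is at most about 5.8% nonzero.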

Note that if $a_1$ is large enough that the term subtracted in the bound on $P_1$ is much smaller than 1, then $P_1$ is very close to 1. (Note, however, that we can still let $\Delta$ take a non-diminishing portion of $\|x\|_1$, even though that portion can be very small.) This means that the original signal is much denser in the block $K'$ than in the second block $\bar{K'}$. Therefore, as in the last step of the modified re-weighted algorithm, we may assign a weight $W_1 = 1$ to all entries of $x$ in $K'$ and a weight $W_2 = W$, $W > 1$, to the entries of $x$ in $\bar{K'}$ and perform the weighted $\ell_1$-minimization. The theoretical results of [13], namely Theorem 3, guarantee that as long as $\delta > \delta_c(\gamma_1, \gamma_2, P_1, P_2, \frac{W_2}{W_1})$, the signal will be recovered with overwhelming probability for large $n$. The numerical examples in the next section do show that the reweighted $\ell_1$ algorithm can increase the recoverable sparsity threshold, i.e. $P_1\gamma_1 + P_2\gamma_2$.

VI. Numerical Computations on the Bounds 

Using numerical evaluations similar to those in [13], we demonstrate a strict improvement over the weak bound of [1] in the sparsity threshold for which our algorithm is guaranteed to succeed. Let $\delta = 0.555$ and the weight ratio $\frac{W_2}{W_1}$ be fixed, which means that $\zeta = \rho_F(\delta)\delta$ is also given. We set $\epsilon = 0.01$. The sizes of the two classes $K'$ and $\bar{K'}$ would then be $\gamma_1 n = (1-\epsilon)\zeta n$ and $\gamma_2 n = (1-\gamma_1)n$ respectively. The sparsity ratios $P_1$ and $P_2$ of course depend on other parameters of the original signal, as given by the bounds on $P_1$ and $P_2$ derived in Section V. For values of $P_1$ close to 1, we search over all pairs of $P_1$ and $P_2$ such that the critical threshold $\delta_c(\gamma_1, \gamma_2, P_1, P_2, \frac{W_2}{W_1})$ is strictly less than $\delta$. This essentially means that a non-uniform signal with sparsity factors $P_1$ and $P_2$ over the sets $K'$ and $\bar{K'}$ is highly probable to be recovered successfully via the weighted $\ell_1$-minimization with weights $W_1$ and $W_2$. For any such $P_1$ and $P_2$, the signal parameters $(\Delta, a_1)$ can be adjusted accordingly. Eventually, we will be able to recover signals with average sparsity factor $P_1\gamma_1 + P_2\gamma_2$ using this method. We simply plot this ratio as a function of $P_1$ in Figure 1. The straight line is the weak bound of [1] for $\delta = 0.555$, which is $\rho_F(\delta)\delta$.

Fig. 1. Recoverable sparsity factor for $\delta = 0.555$, when the modified re-weighted $\ell_1$-minimization algorithm is used.
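The bookkeeping behind the plotted quantity can be sketched as follows. The function name `average_sparsity` is hypothetical, and the weak threshold $\zeta = \rho_F(\delta)\delta$ is taken as an input because its value is not stated numerically in the text. The sketch only computes the block sizes and the average sparsity factor; deciding whether a pair $(P_1, P_2)$ is admissible requires evaluating $\delta_c$ as in [13], which is not attempted here.

```python
def average_sparsity(zeta, eps, p1, p2):
    """Return (gamma1, gamma2, P1*gamma1 + P2*gamma2).

    zeta -- weak sparsity threshold rho_F(delta) * delta (caller-supplied)
    eps  -- parameter in (0, 1) used to size the first block
    p1   -- sparsity factor on the first block K'
    p2   -- sparsity factor on the complement of K'
    """
    gamma1 = (1 - eps) * zeta   # relative size of K'
    gamma2 = 1 - gamma1         # relative size of the complement
    return gamma1, gamma2, p1 * gamma1 + p2 * gamma2
```

For instance, with the purely illustrative values `zeta=0.2`, `eps=0.01`, `p1=0.99` and `p2=0.05`, the average sparsity factor is about 0.236, which exceeds `zeta`; an improvement over the weak bound is then claimed only if that pair also satisfies $\delta_c < \delta$.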